-
- FILE: swedish.words
- VERSION: DEC-SRC-92-Apr-05
-
- EDITOR
-
- Jorge Stolfi <stolfi@src.dec.com>
- DEC Systems Research Center
-
- AUTHOR OF ORIGINAL WORDLIST
-
- Unknown.
-
- DESCRIPTION
-
- The file swedish.words is a list of about 15,000 Swedish words.
-
- The file has one word per line, and is sorted with sort(1)
- in plain ASCII collating sequence.
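-
- The ordering described above can be checked with a few lines of
- Python. This is only a sketch: the function name is hypothetical,
- and it assumes that "plain ASCII collating sequence" means
- byte-by-byte comparison, so that uppercase letters sort before
- lowercase ones and the accent marks sort before all letters.

```python
def is_ascii_sorted(words):
    """Return True if the words are in non-decreasing byte order.

    Hypothetical helper, not part of the original package. Each word
    is encoded as ASCII first, so a non-ASCII character raises
    UnicodeEncodeError instead of being compared silently.
    """
    encoded = [w.encode("ascii") for w in words]
    # Compare each word with its successor; equal neighbours are allowed.
    return all(a <= b for a, b in zip(encoded, encoded[1:]))
```

- Under that assumption, a capitalized proper noun sorts before every
- lowercase word, and a word such as a"r sorts before ar because the
- double quote has a lower ASCII code than any letter.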
-
- The file is supposed to contain all word inflections and verb
- tenses, but it is still extremely incomplete (as one can deduce
- from its size).
-
- Proper nouns are capitalized. Umlauts and circle-accents are
- respectively denoted by a double quote (") and at-sign (@) after
- the modified vowel (A/O/a/o). Besides the letters [a-zA-Z], the
- file uses only double quotes, at-sign, and newline.
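-
- As an illustration of this encoding, the sketch below turns the
- letter-accent pairs back into accented letters. The names PAIRS
- and decode_word are hypothetical, and it is assumed that a double
- quote after a/o/A/O stands for a-umlaut or o-umlaut, and an
- at-sign after a/A for a-ring.

```python
# Letter-accent pairs as described in the text, mapped to Unicode letters.
# This table is an assumption for illustration, not part of the package.
PAIRS = {
    'a"': "ä", "a@": "å", 'o"': "ö",
    'A"': "Ä", "A@": "Å", 'O"': "Ö",
}

def decode_word(word):
    """Replace each letter-accent pair in `word` by its accented letter."""
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in PAIRS:
            out.append(PAIRS[pair])   # consume the vowel and its mark
            i += 2
        else:
            out.append(word[i])       # ordinary letter, copy unchanged
            i += 1
    return "".join(out)
```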
-
- AUXILIARY LISTS
-
- In the same directory as swedish.words you will also find:
-
- swedish.trash
-
- A list of 8744 words from the original wordlist that I
- suspect are incorrect or do not belong in swedish.words.
-
- The list consists mostly of (invalid) un-accented versions of
- accented words. The list also includes abbreviations,
- acronyms, computer slang, obvious typos and misspellings,
- apparently foreign words, and several words that looked
- suspicious to me.
-
- ORIGINAL LISTS
-
- The original wordlist from which these files were compiled is listed
- below. It was obtained by anonymous FTP on 92-Feb-10.
-
- [1] from: relay.cs.toronto.edu : /doc/Dictionaries
- file: words.swedish.Z
- size: 96169 bytes (200853 bytes uncompressed)
-
- COMMENTS: The list words.swedish.Z [1] uses the characters {}|[]\
- to represent accented letters. However, the list also appears to
- include two additional (invalid) versions of every accented word,
- where the umlauts and circle-accents are either missing or encoded
- by digrams (ae/aa/oe/Ae/Aa/Oe).
-
- COMPILATION PROCESS
-
- The file swedish.words is based on the file "words.swedish"
- [1], with the characters {}|[]\ mapped to the letter-accent
- pairs (a"/a@/o"/A"/A@/O"), respectively.
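-
- The character mapping can be sketched as follows. The pairing is
- an assumption obtained by reading the two lists above in the same
- order; the names BRACKET_TO_PAIR and convert_word are hypothetical
- and do not come from the original tools.

```python
# Assumed pairing, read off the two lists in order:
#   {  }  |  [  ]  \   ->   a"  a@  o"  A"  A@  O"
BRACKET_TO_PAIR = str.maketrans({
    "{": 'a"', "}": "a@", "|": 'o"',
    "[": 'A"', "]": "A@", "\\": 'O"',
})

def convert_word(word):
    """Rewrite one word from the original encoding to letter-accent pairs."""
    return word.translate(BRACKET_TO_PAIR)
```

- Words without any of the six special characters pass through
- unchanged.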
-
- I also eliminated every word that could be an accentless version
- of an accented word. Since I don't know the language, it is
- likely that I deleted some valid words.
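-
- The exact procedure used for this step is not recorded here. One
- way it could be done, following the two kinds of invalid variants
- described in the COMMENTS above (accent dropped, or accent written
- as a digram), is sketched below. All names are hypothetical.

```python
# Digram spellings listed in the COMMENTS section, paired in order with
# the letter-accent pairs (an assumption for illustration).
DIGRAMS = {
    'a"': "ae", "a@": "aa", 'o"': "oe",
    'A"': "Ae", "A@": "Aa", 'O"': "Oe",
}

def accentless_variants(word):
    """Return the two accentless spellings of an accented word."""
    stripped = word.replace('"', "").replace("@", "")
    digram = word
    for pair, replacement in DIGRAMS.items():
        digram = digram.replace(pair, replacement)
    return {stripped, digram}

def split_suspects(words):
    """Split words into (kept, suspects).

    A word is a suspect if it equals an accentless variant of some
    accented word that is also in the list.
    """
    words = set(words)
    suspects = set()
    for w in words:
        if '"' in w or "@" in w:
            suspects |= (accentless_variants(w) - {w}) & words
    return sorted(words - suspects), sorted(suspects)
```

- A rule of this kind cannot tell a genuine accentless word from a
- spurious one, which is why some valid words may have been removed.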
-
- (NON-)COPYRIGHT STATUS
-
- To the best of my knowledge, all the files I used to build these
- wordlists were available for public distribution and use, at least
- for non-commercial purposes. I have confirmed this assumption with
- the authors of the lists, whenever they were known.
-
- Therefore, it is safe to assume that the wordlists in this package
- can also be freely copied, distributed, modified, and used for
- personal, educational, and research purposes. (Use of these files in
- commercial products may require written permission from DEC and/or
- the authors of the original lists.)
-
- Whenever you distribute any of these wordlists, please distribute
- also the accompanying README file. If you distribute a modified
- copy of one of these wordlists, please include the original README
- file with a note explaining your modifications. Your users will
- surely appreciate that.
-
- (NO-)WARRANTY DISCLAIMER
-
- These files, like the original wordlists on which they are based,
- are still very incomplete, uneven, and inconsistent, and probably
- contain many errors. They are offered "as is" without any warranty
- of correctness or fitness for any particular purpose. Neither I nor
- my employer can be held responsible for any losses or damages that
- may result from their use.
-
-